Let’s analyze the answers to the GSS question “Taken all together, how would you say things are these days–would you say that you are very happy, pretty happy, or not too happy?” The GSS calls this variable “HAPPY”. You can find (and search) a list of all questions included in the GSS at the GSS Data Explorer or, as I prefer, at the Survey Documentation and Analysis (SDA) website. The latter shows more information on how the variable was coded.
GSS2016.happy <- GSS2016 %>% select(happy)
summary(GSS2016.happy$happy)
## very happy pretty happy not too happy DK IAP
## 806 1601 452 0 0
## NA NA's
## 0 8
We see a bunch of zero cells (the level labels were retained even though nobody falls into them) and 8 NA’s. Let’s get rid of all of these:
GSS2016.happy.clean <- GSS2016.happy %>% drop_na() %>% droplevels()
summary(GSS2016.happy.clean)
## happy
## very happy : 806
## pretty happy :1601
## not too happy: 452
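To see what the two cleaning steps do in isolation, here is a small sketch on a made-up data frame (the toy values are hypothetical, not GSS data). drop_na() removes rows with missing values, and droplevels() then discards factor levels that no longer occur:

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical): one unused level ("DK") and one missing value
toy <- data.frame(
  happy = factor(c("very happy", "pretty happy", NA, "pretty happy"),
                 levels = c("very happy", "pretty happy", "DK"))
)

toy.clean <- toy %>% drop_na() %>% droplevels()

nrow(toy.clean)          # number of rows left once the NA row is removed
levels(toy.clean$happy)  # the unused "DK" level is no longer listed
```

The order matters little here, but dropping the NA rows first means droplevels() sees only the rows that will actually be plotted.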
Load the plotly library:
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
Create an interactive bar chart. We will first do this with ggplot, and then call ggplotly() to make it interactive:
p <- ggplot(data = GSS2016.happy.clean, aes(x = happy)) + geom_bar() # geom_bar() plots the count per category
p
ggplotly(p)
You can now perform several interactive tasks on this graph, such as hovering over a bar to display information, zooming in, changing the x- or y-axis range, saving it as a .png, etc. The amazing thing is that this doesn’t just work within RStudio, but also right there in the HTML that you knitted. This means you can embed this graph in websites and still have all these interactions.
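If you are not knitting a document, one way to get a standalone HTML page is saveWidget() from the htmlwidgets package. The sketch below is only an illustration: the data frame is a stand-in rebuilt from the counts shown earlier, and the output file name is a hypothetical choice.

```r
library(ggplot2)
library(plotly)
library(htmlwidgets)

# Stand-in for GSS2016.happy.clean, rebuilt from the counts in the summary
lv <- c("very happy", "pretty happy", "not too happy")
happy.df <- data.frame(
  happy = factor(rep(lv, times = c(806, 1601, 452)), levels = lv)
)

p <- ggplot(happy.df, aes(x = happy)) + geom_bar()
w <- ggplotly(p)

# Hypothetical output location. selfcontained = FALSE keeps the JavaScript
# libraries in a folder next to the HTML file instead of inlining them.
out <- file.path(tempdir(), "happy_barchart.html")
saveWidget(w, file = out, selfcontained = FALSE)
```

Opening the resulting file in a browser should give the same hover and zoom behavior as in the knitted document.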
We used the ggplotly() function to translate the ggplot graph into a plotly graph. However, I want to show “native” programming with plotly as well. The reason is that there is not only a translation of the plotly.js JavaScript library into R, but also one into Python, and the only difference between the two is the syntax, which is not that hard to overcome.
Here is how you get the barchart using the native plot_ly() function of the plotly package, without going through ggplot. You will immediately notice some style differences:
plot_ly(data = GSS2016.happy.clean, x = ~happy)
## No trace type specified:
## Based on info supplied, a 'histogram' trace seems appropriate.
## Read more about this trace type -> https://plot.ly/r/reference/#histogram
That’s impressive right out of the box, but there were some complaints, such as “No trace type specified: Based on info supplied, a ‘histogram’ trace seems appropriate.” I actually want a bar chart, so I’d better tell plot_ly:
plot_ly(data = GSS2016.happy.clean, x = ~happy, type = "bar")
Oops, this didn’t work out, so it seems the histogram trace was appropriate:
plot_ly(data = GSS2016.happy.clean, x = ~happy, type = "histogram")
See, no complaints anymore. It turns out, the bar trace can only be used on summarized data, i.e., when we just have the categories in one column, and the counts (or proportions) in the other column. Like this (note how I now provide both x and y variables in the plot_ly() call):
plotdata <- GSS2016.happy.clean %>% group_by(happy) %>% summarize(count = n()) %>%
    ungroup()
plot_ly(data = plotdata, x = ~happy, y = ~count, type = "bar")
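As an aside, dplyr's count() should produce the same summarized table in one step. The sketch below assumes a stand-in data frame rebuilt from the counts shown earlier (the names raw and plotdata2 are made up for this example):

```r
library(dplyr)

# Stand-in for the cleaned respondent-level data (one row per respondent)
lv <- c("very happy", "pretty happy", "not too happy")
raw <- data.frame(
  happy = factor(rep(lv, times = c(806, 1601, 452)), levels = lv)
)

# count() groups, tallies, and ungroups; name = sets the count column's name
plotdata2 <- raw %>% count(happy, name = "count")
plotdata2
```

The result has one row per category and a count column, which is the shape the bar trace expects.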
At this point, it’s worth pointing out the most important help pages when it comes to plotly. First, there is https://plot.ly/r/, which gives you a whole gallery of charts that you can click on to learn how they are created with plotly (i.e., go through a tutorial). The other one is the Plotly Cookbook by Carson Sievert (the author of the plotly package). At times, it is a bit outdated, but I hear a second edition is coming out very soon! There is also a plotly cheat-sheet here. However, to me, the most useful page is the documentation page of the various functions at https://plot.ly/r/reference/.
How about editing the axis labels? This is done using the layout() command:
plot_ly(plotdata, x = ~happy, y = ~count, type = "bar") %>% layout(xaxis = list(title = "Happiness"),
yaxis = list(title = "Frequency"))
What about editing the y-axis range? This is just another argument in the list that defines the yaxis, but which one? ggplot uses limits, but this is not the case for plotly. Better check the reference at https://plot.ly/r/reference/ and click on layout and y axis. Aha, range is what we are looking for:
plot_ly(plotdata, x = ~happy, y = ~count, type = "bar") %>% layout(xaxis = list(title = "Happiness"),
yaxis = list(title = "Frequency", range = c(0, 1810)))
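Instead of hard-coding the upper limit, the range can also be derived from the data. This is a minimal sketch on a stand-in plotdata built from the counts above; the 10% headroom is an arbitrary choice for illustration, not something the plotly reference prescribes.

```r
library(plotly)

# Stand-in for the summarized data used in this section
lv <- c("very happy", "pretty happy", "not too happy")
plotdata <- data.frame(happy = factor(lv, levels = lv),
                       count = c(806, 1601, 452))

# Upper axis limit: tallest bar plus 10% headroom (arbitrary)
y.max <- max(plotdata$count) * 1.1

p <- plot_ly(plotdata, x = ~happy, y = ~count, type = "bar") %>%
  layout(xaxis = list(title = "Happiness"),
         yaxis = list(title = "Frequency", range = c(0, y.max)))
p
```

This keeps the axis sensible if the counts change, for example when the same code is rerun on another survey year.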
What about plotting the bar chart horizontally? With a bar trace, this is as simple as switching the x and y variables and using orientation='h' (I also got rid of the y-axis label):
plot_ly(plotdata, y = ~happy, x = ~count, type = "bar", orientation = "h") %>%
layout(yaxis = list(title = ""), xaxis = list(title = "Frequency",
range = c(0, 1810)))
Here, the y-axis labels could really go over two lines:
library(stringr)
plot_ly(plotdata, y = ~str_wrap(happy, 7), x = ~count, type = "bar",
orientation = "h") %>% layout(yaxis = list(title = ""), xaxis = list(title = "Frequency",
range = c(0, 1810)))
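To see what str_wrap() hands to plotly, it can help to look at the wrapped labels on their own. A small sketch, using the three category labels as plain strings:

```r
library(stringr)

labels <- c("very happy", "pretty happy", "not too happy")

# Wrap each label to a target width of 7 characters;
# line breaks are inserted as "\n"
wrapped <- str_wrap(labels, width = 7)
wrapped
```

Each label now contains at least one newline, which is what ends up as a line break in the axis labels of the chart above.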
OK, but plotly doesn’t draw ticks. If we want them, we can add them:
library(stringr)
plot_ly(plotdata, y = ~str_wrap(happy, 7), x = ~count, type = "bar",
orientation = "h") %>% layout(yaxis = list(title = "", ticks = "outside"),
xaxis = list(title = "Frequency", range = c(0, 1810)))
Changing colors of the bars:
library(stringr)
plot_ly(plotdata, y = ~str_wrap(happy, 7), x = ~count, color = "orange",
type = "bar", orientation = "h") %>% layout(yaxis = list(title = "",
ticks = "outside"), xaxis = list(title = "Frequency", range = c(0,
1810)))
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
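The RColorBrewer warnings hint that plotly treated the string "orange" as a data value with a single level and looked up a palette color for it, rather than using it as a literal color. As far as I understand the plot_ly() documentation, wrapping the value in I() marks it "as is" and avoids that scaling. A minimal sketch on a stand-in plotdata:

```r
library(plotly)

# Stand-in for the summarized data used in this section
lv <- c("very happy", "pretty happy", "not too happy")
plotdata <- data.frame(happy = factor(lv, levels = lv),
                       count = c(806, 1601, 452))

# I() tells plotly to use "orange" literally instead of mapping it to a palette
p <- plot_ly(plotdata, y = ~happy, x = ~count, color = I("orange"),
             type = "bar", orientation = "h")
p
```

With I("orange") the bars should come out orange, and the palette warnings should no longer appear.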
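However, similar to ggplot, we can map a variable to color. (You might have noticed that in the code above there is no tilde “~” sign in front of the color value, i.e., it is a constant rather than a variable in the data. The warnings suggest that plotly still ran this constant through its default palette instead of using it as a literal color.)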
library(stringr)
plot_ly(plotdata, y = ~str_wrap(happy, 7), x = ~count, color = ~happy,
type = "bar", orientation = "h") %>% layout(yaxis = list(title = "",
ticks = "outside"), xaxis = list(title = "Frequency", range = c(0,
1810)))
Just like in ggplot, this produces a legend, which in plotly is interactive. Clicking on a category in the legend hides the corresponding bar. (It doesn’t recompute proportions, though!)
We can also hide the legend:
library(stringr)
plot_ly(plotdata, y = ~str_wrap(happy, 7), x = ~count, color = ~happy,
type = "bar", orientation = "h") %>% layout(showlegend = FALSE,
yaxis = list(title = "", ticks = "outside"), xaxis = list(title = "Frequency",
range = c(0, 1810)))
Apropos hiding, we can hide the whole interactive suite (zooming, etc.) that appears at the top of every plotly plot using config() (this is buried deep in the documentation):
library(stringr)
plot_ly(plotdata, y = ~str_wrap(happy, 7), x = ~count, color = ~happy,
type = "bar", orientation = "h") %>% layout(showlegend = FALSE,
yaxis = list(title = "", ticks = "outside"), xaxis = list(title = "Frequency",
range = c(0, 1810))) %>% config(collaborate = FALSE,
displaylogo = FALSE, modeBarButtonsToRemove = list("resetScale2d",
"sendDataToCloud", "zoom2d", "zoomIn2d", "zoomOut2d",
"pan2d", "select2d", "lasso2d", "hoverClosestCartesian",
"hoverCompareCartesian", "hoverClosestGl2d", "hoverClosestPie",
"toggleHover", "resetViews", "toggleSpikelines"))
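If the goal is to remove the toolbar altogether rather than individual buttons, the plotly.js configuration option displayModeBar should do it in one argument. A minimal sketch on a stand-in plotdata (check the configuration reference for your plotly version):

```r
library(plotly)

# Stand-in for the summarized data used in this section
lv <- c("very happy", "pretty happy", "not too happy")
plotdata <- data.frame(happy = factor(lv, levels = lv),
                       count = c(806, 1601, 452))

# displayModeBar = FALSE hides the whole mode bar, logo included
p <- plot_ly(plotdata, y = ~happy, x = ~count, type = "bar",
             orientation = "h") %>%
  config(displayModeBar = FALSE)
p
```

Listing buttons in modeBarButtonsToRemove, as above, remains the way to go when only some of the buttons should disappear.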
Markers to modify color
What if you want to override the default color choice for the bars? Here, the marker command helps, which gives fine control over the way the markers (=bars) look:
# add color info to dataset
library(RColorBrewer)
plotdata$mycol <- brewer.pal(12, "Set3")[c(12, 7, 5)]
plotdata
plot_ly(plotdata, y = ~str_wrap(happy, 7), x = ~count, type = "bar",
orientation = "h", marker = list(color = ~mycol)) %>% layout(showlegend = FALSE,
yaxis = list(title = "", ticks = "outside"), xaxis = list(title = "Frequency",
range = c(0, 1810)))
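An alternative that keeps the color mapping (and therefore the legend) is to map color to the variable and pass the palette through the colors argument of plot_ly(). This is a sketch under the assumption that a plain character vector of hex codes is applied to the factor levels in order; the stand-in plotdata and the name mycols are made up for the example.

```r
library(plotly)
library(RColorBrewer)

# Stand-in for the summarized data used in this section
lv <- c("very happy", "pretty happy", "not too happy")
plotdata <- data.frame(happy = factor(lv, levels = lv),
                       count = c(806, 1601, 452))

# Same three Set3 colors as above, kept outside the data frame
mycols <- brewer.pal(12, "Set3")[c(12, 7, 5)]

p <- plot_ly(plotdata, y = ~happy, x = ~count, color = ~happy,
             colors = mycols, type = "bar", orientation = "h")
p
```

The marker approach is the better fit when each bar's color is stored as a column of the data, as in the chart above.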
The marker command has other options, such as plotting a line around the bars:
plot_ly(plotdata, y = ~str_wrap(happy, 7), x = ~count, type = "bar",
orientation = "h", marker = list(color = ~mycol, line = list(color = "black",
width = 1.5))) %>% layout(showlegend = FALSE, yaxis = list(title = "",
ticks = "outside"), xaxis = list(title = "Frequency", range = c(0,
1810)))
Adding a title and subtitle is unfortunately not as straightforward as just using title= and subtitle=. In fact, both are annotations added to the plot in layout(), but first the top margin has to be adjusted via margin to make room for them. Let’s demonstrate with the vertical bar chart:
plot_ly(plotdata, x = ~happy, y = ~count, type = "bar") %>% layout(xaxis = list(title = "Happiness"),
yaxis = list(title = "Frequency", range = c(0, 1810)), margin = list(t = 80),
annotations = list(text = "Barchart of Happiness", showarrow = FALSE,
font = list(size = 19), x = 0.5, xref = "paper", xanchor = "center",
y = 1.2, yref = "paper"))
Adding a subtitle is just adding a second annotation:
n <- sum(plotdata$count)
plot_ly(plotdata, x = ~happy, y = ~count, type = "bar") %>% layout(xaxis = list(title = "Happiness"),
yaxis = list(title = "Frequency", range = c(0, 1810)), margin = list(t = 80),
annotations = list(text = "Barchart of Happiness", showarrow = FALSE,
font = list(size = 19), x = 0.5, xref = "paper", xanchor = "center",
y = 1.2, yref = "paper")) %>% add_annotations(text = paste0("Based on the 2016 GSS with ",
n, " respondents"), showarrow = FALSE, font = list(size = 16),
x = 0.5, xref = "paper", xanchor = "center", y = 1.12, yref = "paper")
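Since the title and subtitle annotations differ only in text, font size, and vertical position, the repeated list can be wrapped in a small helper. The function title_annotation() below is a hypothetical convenience written for this example, not part of plotly:

```r
library(plotly)

# Stand-in for the summarized data used in this section
lv <- c("very happy", "pretty happy", "not too happy")
plotdata <- data.frame(happy = factor(lv, levels = lv),
                       count = c(806, 1601, 452))
n <- sum(plotdata$count)

# Hypothetical helper: one text annotation positioned in paper coordinates
title_annotation <- function(text, size, y, x = 0.5, xanchor = "center") {
  list(text = text, showarrow = FALSE, font = list(size = size),
       x = x, xref = "paper", xanchor = xanchor, y = y, yref = "paper")
}

p <- plot_ly(plotdata, x = ~happy, y = ~count, type = "bar") %>%
  layout(margin = list(t = 80),
         annotations = list(
           title_annotation("Barchart of Happiness", size = 19, y = 1.2),
           title_annotation(paste0("Based on the 2016 GSS with ", n,
                                   " respondents"), size = 16, y = 1.12)
         ))
p
```

Calling the helper with x = 0 and xanchor = "left" would give a left-aligned variant without retyping the whole list.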
OK, let’s try adding a title and subtitle to the horizontal bar chart, but this time left-aligned:
plot_ly(plotdata, y = ~str_wrap(happy, 7), x = ~count, type = "bar",
orientation = "h", marker = list(color = ~mycol)) %>% layout(showlegend = FALSE,
yaxis = list(title = "", ticks = "outside"), xaxis = list(title = "Frequency",
range = c(0, 1810)), margin = list(t = 80), annotations = list(text = "Barchart of Happiness",
showarrow = FALSE, font = list(size = 19), x = 0, xref = "paper",
xanchor = "left", y = 1.2, yref = "paper")) %>% add_annotations(text = paste0("Based on the 2016 GSS with ",
n, " respondents"), showarrow = FALSE, font = list(size = 16),
x = 0, xref = "paper", xanchor = "left", y = 1.12, yref = "paper")
First, we can choose between different hover modes interactively in the chart. In the graph above, click on the “Compare Data on Hover” symbol at the top of the chart, and you will see the hover mode change. We can also set the hover mode programmatically. Let’s illustrate with a basic chart. The default is:
plot_ly(plotdata, x = ~happy, y = ~count, type = "bar") %>% layout(hovermode = "closest")
Showing a different mode with labeling the x and y value on hover:
plot_ly(plotdata, x = ~happy, y = ~count, type = "bar") %>% layout(hovermode = "x+y")
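A note of caution: going by the layout section of the plotly reference, the documented values for hovermode appear to be "x", "y", "closest", and FALSE, while combinations such as "x+y" are listed for hoverinfo. The sketch below therefore uses "x", which should correspond to the “Compare Data on Hover” button; treat this as an assumption to verify against your plotly version.

```r
library(plotly)

# Stand-in for the summarized data used in this section
lv <- c("very happy", "pretty happy", "not too happy")
plotdata <- data.frame(happy = factor(lv, levels = lv),
                       count = c(806, 1601, 452))

# hovermode = "x": hover labels are shown for the x position under the cursor
p <- plot_ly(plotdata, x = ~happy, y = ~count, type = "bar") %>%
  layout(hovermode = "x")
p
```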
Changing the text that is shown on hover:
plot_ly(plotdata, x = ~happy, y = ~count, type = "bar", hovertext = paste("<b>Count:</b>",
plotdata$count), hoverinfo = "x+text") %>% layout(hovermode = "x+y")
Showing both count and percentages on hover (and plotting percentages on the y-axis):
plot_ly(plotdata, x = ~happy, y = ~100 * (count/n), type = "bar",
hovertext = paste("<b>Count:</b>", plotdata$count, "<br><b>Percent:</b>",
paste0(round(100 * plotdata$count/n, 2), "%")), hoverinfo = "x+text") %>%
layout(yaxis = list(title = "Percent (%)"), hovermode = "x+y")
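The hover text gets hard to read once it is built inline. One option is to compute the percentage and the label as columns first and then refer to them with a tilde. A sketch on a stand-in plotdata; the column names percent and hover are made up for this example:

```r
library(dplyr)
library(plotly)

# Stand-in for the summarized data used in this section
lv <- c("very happy", "pretty happy", "not too happy")
plotdata <- data.frame(happy = factor(lv, levels = lv),
                       count = c(806, 1601, 452))
n <- sum(plotdata$count)

# Precompute the percentage and the HTML hover label per category
plotdata.pct <- plotdata %>%
  mutate(percent = 100 * count / n,
         hover = paste0("<b>Count:</b> ", count,
                        "<br><b>Percent:</b> ", round(percent, 2), "%"))

p <- plot_ly(plotdata.pct, x = ~happy, y = ~percent, type = "bar",
             hovertext = ~hover, hoverinfo = "x+text") %>%
  layout(yaxis = list(title = "Percent (%)"))
p
```

Keeping the label in the data frame also makes it easy to inspect with print(plotdata.pct) before plotting.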
First, let’s color the bars:
plot_ly(plotdata, x = ~happy, y = ~100 * (count/n), color = ~happy,
type = "bar", hovertext = paste("<b>Count:</b>", plotdata$count,
"<br><b>Percent:</b>", paste0(round(100 * plotdata$count/n,
2), "%")), hoverinfo = "x+text") %>% layout(yaxis = list(title = "Percent (%)"),
hovermode = "x+y")
Now, suppose we want to turn this into a stacked bar chart. This means we do not have labels along the x-axis. Rather, the coloring of the bars will tell us which group we are in. Technically, we need to create an artificial grouping variable. I will just call it “1”:
plot_ly(plotdata, x = 1, y = ~100 * (count/n), color = ~happy,
type = "bar", hovertext = paste("<b>Count:</b>", plotdata$count,
"<br><b>Percent:</b>", paste0(round(100 * plotdata$count/n,
2), "%")), hoverinfo = "x+text") %>% layout(yaxis = list(title = "Percent (%)"),
hovermode = "x+y")
Almost there. Unfortunately, plotly didn’t put the bars on top of each other but next to each other. Using barmode="stack" in layout() takes care of this. Also, note how I changed hoverinfo and hovermode to get nice hover effects:
plot_ly(plotdata, x = 1, y = ~100 * (count/n), color = ~happy,
type = "bar", hovertext = paste("<b>Count:</b>", plotdata$count,
"<br><b>Percent:</b>", paste0(round(100 * plotdata$count/n,
2), "%")), hoverinfo = "text") %>% layout(yaxis = list(title = "Percent (%)"),
hovermode = "closest", barmode = "stack")
Finally, I want to flip the chart and use a vertical legend. We can access legend attributes via legend= in layout(). Here is how:
plot_ly(plotdata, y = 1, x = ~100 * (count/n), color = ~happy,
type = "bar", orientation = "h", hovertext = paste("<b>Count:</b>",
plotdata$count, "<br><b>Percent:</b>", paste0(round(100 *
plotdata$count/n, 2), "%")), hoverinfo = "text") %>%
layout(xaxis = list(title = "Percent (%)"), hovermode = "closest",
barmode = "stack", legend = list(orientation = "h", traceorder = "normal"))
Hmm, that put the legend at the bottom, which is the plotly default behavior for orientation='h'. You can put the legend on top by specifying x=, xanchor=, y= and yanchor=. I also adjusted the top margin a bit to make some room.
plot_ly(plotdata, y = 1, x = ~100 * (count/n), color = ~happy,
type = "bar", orientation = "h", hovertext = paste("<b>Count:</b>",
plotdata$count, "<br><b>Percent:</b>", paste0(round(100 *
plotdata$count/n, 2), "%")), hoverinfo = "text") %>%
layout(xaxis = list(title = "Percent (%)"), hovermode = "closest",
barmode = "stack", margin = list(t = 80), legend = list(orientation = "h",
traceorder = "normal", x = 0, xanchor = "left", y = 1.4,
yanchor = "top"))  # yanchor takes "auto", "top", "middle", or "bottom"
The final modification I will do here is hide the y-axis:
plot_ly(plotdata, y = 1, x = ~100 * (count/n), color = ~happy,
type = "bar", orientation = "h", hovertext = paste("<b>Count:</b>",
plotdata$count, "<br><b>Percent:</b>", paste0(round(100 *
plotdata$count/n, 2), "%")), hoverinfo = "text") %>%
layout(xaxis = list(title = "Percent (%)"), yaxis = list(visible = FALSE),
hovermode = "closest", barmode = "stack", margin = list(t = 80),
legend = list(orientation = "h", traceorder = "normal",
x = 0, xanchor = "left", y = 1.4, yanchor = "top"))  # yanchor takes "auto", "top", "middle", or "bottom"
Some modifications to consider: Add a title and perhaps add a legend title. Both can be done with annotations.
(Your best solution here!)